preprint submitted
Advancing Embodied Intelligence in Robotic-Assisted Endovascular Procedures: A Systematic Review of AI Solutions
Yao, Tianliang, Lu, Bo, Kowarschik, Markus, Yuan, Yixuan, Zhao, Hubin, Ourselin, Sebastien, Althoefer, Kaspar, Ge, Junbo, Qi, Peng
Endovascular procedures have revolutionized vascular disease treatment, yet their manual execution is challenged by the demands for high precision, operator fatigue, and radiation exposure. Robotic systems have emerged as transformative solutions to mitigate these inherent limitations. A pivotal moment has arrived, where a confluence of pressing clinical needs and breakthroughs in AI creates an opportunity for a paradigm shift toward Embodied Intelligence (EI), enabling robots to navigate complex vascular networks and adapt to dynamic physiological conditions. Data-driven approaches, leveraging advanced computer vision, medical image analysis, and machine learning, drive this evolution by enabling real-time vessel segmentation, device tracking, and anatomical landmark detection. Reinforcement learning and imitation learning further enhance navigation strategies and replicate expert techniques. This review systematically analyzes the integration of EI into endovascular robotics, identifying profound systemic challenges such as the heterogeneity in validation standards and the gap between human mimicry and machine-native capabilities. Based on this analysis, a conceptual roadmap is proposed that reframes the ultimate objective away from systems that supplant clinical decision-making. This vision of augmented intelligence, where the clinician's role evolves into that of a high-level supervisor, provides a principled foundation for the future of the field.
Augmented Physics-Based Li-ion Battery Model via Adaptive Ensemble Sparse Learning and Conformal Prediction
da Silva, Samuel Filgueira, Ozkan, Mehmet Fatih, Idrissi, Faissal El, Canova, Marcello
--Accurate electrochemical models are essential for the safe and efficient operation of lithium-ion batteries in real-world applications such as electrified vehicles and grid storage. Reduced-order models (ROM) offer a balance between fidelity and computational efficiency but often struggle to capture complex and nonlinear behaviors, such as the dynamics in the cell voltage response under high C-rate conditions. T o address these limitations, this study proposes an Adaptive Ensemble Sparse Identification (AESI) framework that enhances the accuracy of reduced-order li-ion battery models by compensating for unpredictable dynamics. The approach integrates an Extended Single Particle Model (ESPM) with an evolutionary ensemble sparse learning strategy to construct a robust hybrid model. In addition, the AESI framework incorporates a conformal prediction method to provide theoretically guaranteed uncertainty quantification for voltage error dynamics, thereby improving the reliability of the model's predictions. Evaluation across diverse operating conditions shows that the hybrid model (ESPM + AESI) improves the voltage prediction accuracy, achieving mean squared error reductions of up to 46% on unseen data. Prediction reliability is further supported by conformal prediction, yielding statistically valid prediction intervals with coverage ratios of 96.85% and 97.41% for the ensemble models based on bagging and stability selection, respectively.
Subspace-wise Hybrid RL for Articulated Object Manipulation
Kim, Yujin, Choi, Sol, You, Bum-Jae, Jang, Keunwoo, Lee, Yisoo
Articulated object manipulation is a challenging task, requiring constrained motion and adaptive control to handle the unknown dynamics of the manipulated objects. While reinforcement learning (RL) has been widely employed to tackle various scenarios and types of articulated objects, the complexity of these tasks, stemming from multiple intertwined objectives makes learning a control policy in the full task space highly difficult. To address this issue, we propose a Subspace-wise hybrid RL (SwRL) framework that learns policies for each divided task space, or subspace, based on independent objectives. This approach enables adaptive force modulation to accommodate the unknown dynamics of objects. Additionally, it effectively leverages the previously underlooked redundant subspace, thereby maximizing the robot's dexterity. Our method enhances both learning efficiency and task execution performance, as validated through simulations and real-world experiments. Supplementary video is available at https://youtu.be/PkNxv0P8Atk
On-device Anomaly Detection in Conveyor Belt Operations
Martinez-Rau, Luciano S., Zhang, Yuxuan, Oelmann, Bengt, Bader, Sebastian
Mining 4.0 leverages advancements in automation, digitalization, and interconnected technologies from Industry 4.0 to address the unique challenges of the mining sector, enhancing efficiency, safety, and sustainability. Conveyor belts are crucial in mining operations by enabling the continuous and efficient movement of bulk materials over long distances, which directly impacts productivity. While detecting anomalies in specific conveyor belt components, such as idlers, pulleys, and belt surfaces, has been widely studied, identifying the root causes of these failures remains critical due to factors like changing production conditions and operator errors. Continuous monitoring of mining conveyor belt work cycles for anomaly detection is still at an early stage and requires robust solutions. This study proposes two distinctive pattern recognition approaches for real-time anomaly detection in the operational cycles of mining conveyor belts, combining feature extraction, threshold-based cycle detection, and tiny machine-learning classification. Both approaches outperformed a state-of-the-art technique on two datasets for duty cycle classification in terms of F1-scores. The first approach, with 97.3% and 80.2% for normal and abnormal cycles, respectively, reaches the highest performance in the first dataset while the second approach excels on the second dataset, scoring 91.3% and 67.9%. Implemented on two low-power microcontrollers, the methods demonstrated efficient, real-time operation with energy consumption of 13.3 and 20.6 ${\mu}$J during inference. These results offer valuable insights for detecting mechanical failure sources, supporting targeted preventive maintenance, and optimizing production cycles.
Efficient Collaborative Navigation through Perception Fusion for Multi-Robots in Unknown Environments
Lin, Qingquan, Lu, Weining, Meng, Litong, Li, Chenxi, Liang, Bin
For tasks conducted in unknown environments with efficiency requirements, real-time navigation of multi-robot systems remains challenging due to unfamiliarity with surroundings.In this paper, we propose a novel multi-robot collaborative planning method that leverages the perception of different robots to intelligently select search directions and improve planning efficiency. Specifically, a foundational planner is employed to ensure reliable exploration towards targets in unknown environments and we introduce Graph Attention Architecture with Information Gain Weight(GIWT) to synthesizes the information from the target robot and its teammates to facilitate effective navigation around obstacles.In GIWT, after regionally encoding the relative positions of the robots along with their perceptual features, we compute the shared attention scores and incorporate the information gain obtained from neighboring robots as a supplementary weight. We design a corresponding expert data generation scheme to simulate real-world decision-making conditions for network training. Simulation experiments and real robot tests demonstrates that the proposed method significantly improves efficiency and enables collaborative planning for multiple robots. Our method achieves approximately 82% accuracy on the expert dataset and reduces the average path length by about 8% and 6% across two types of tasks compared to the fundamental planner in ROS tests, and a path length reduction of over 6% in real-world experiments.
A Systematic Review of Edge Case Detection in Automated Driving: Methods, Challenges and Future Directions
Rahmani, Saeed, Rieder, Sabine, de Gelder, Erwin, Sonntag, Marcel, Mallada, Jorge Lorente, Kalisvaart, Sytze, Hashemi, Vahid, Calvert, Simeon C.
The rapid development of automated vehicles (AVs) promises to revolutionize transportation by enhancing safety and efficiency. However, ensuring their reliability in diverse real-world conditions remains a significant challenge, particularly due to rare and unexpected situations known as edge cases. Although numerous approaches exist for detecting edge cases, there is a notable lack of a comprehensive survey that systematically reviews these techniques. This paper fills this gap by presenting a practical, hierarchical review and systematic classification of edge case detection and assessment methodologies. Our classification is structured on two levels: first, categorizing detection approaches according to AV modules, including perception-related and trajectory-related edge cases; and second, based on underlying methodologies and theories guiding these techniques. We extend this taxonomy by introducing a new class called "knowledge-driven" approaches, which is largely overlooked in the literature. Additionally, we review the techniques and metrics for the evaluation of edge case detection methods and identified edge cases. To our knowledge, this is the first survey to comprehensively cover edge case detection methods across all AV subsystems, discuss knowledge-driven edge cases, and explore evaluation techniques for detection methods. This structured and multi-faceted analysis aims to facilitate targeted research and modular testing of AVs. Moreover, by identifying the strengths and weaknesses of various approaches and discussing the challenges and future directions, this survey intends to assist AV developers, researchers, and policymakers in enhancing the safety and reliability of automated driving (AD) systems through effective edge case detection.
Denoising Plane Wave Ultrasound Images Using Diffusion Probabilistic Models
Asgariandehkordi, Hojat, Goudarzi, Sobhan, Sharifzadeh, Mostafa, Basarab, Adrian, Rivaz, Hassan
Ultrasound plane wave imaging is a cutting-edge technique that enables high frame-rate imaging. However, one challenge associated with high frame-rate ultrasound imaging is the high noise associated with them, hindering their wider adoption. Therefore, the development of a denoising method becomes imperative to augment the quality of plane wave images. Drawing inspiration from Denoising Diffusion Probabilistic Models (DDPMs), our proposed solution aims to enhance plane wave image quality. Specifically, the method considers the distinction between low-angle and high-angle compounding plane waves as noise and effectively eliminates it by adapting a DDPM to beamformed radiofrequency (RF) data. The method underwent training using only 400 simulated images. In addition, our approach employs natural image segmentation masks as intensity maps for the generated images, resulting in accurate denoising for various anatomy shapes. The proposed method was assessed across simulation, phantom, and in vivo images. The results of the evaluations indicate that our approach not only enhances image quality on simulated data but also demonstrates effectiveness on phantom and in vivo data in terms of image quality. Comparative analysis with other methods underscores the superiority of our proposed method across various evaluation metrics. The source code and trained model will be released along with the dataset at: http://code.sonography.ai
Hybrid X-Linker: Automated Data Generation and Extreme Multi-label Ranking for Biomedical Entity Linking
Ruas, Pedro, Gallego, Fernando, Veredas, Francisco J., Couto, Francisco M.
State-of-the-art deep learning entity linking methods rely on extensive human-labelled data, which is costly to acquire. Current datasets are limited in size, leading to inadequate coverage of biomedical concepts and diminished performance when applied to new data. In this work, we propose to automatically generate data to create large-scale training datasets, which allows the exploration of approaches originally developed for the task of extreme multi-label ranking in the biomedical entity linking task. We propose the hybrid X-Linker pipeline that includes different modules to link disease and chemical entity mentions to concepts in the MEDIC and the CTD-Chemical vocabularies, respectively. X-Linker was evaluated on several biomedical datasets: BC5CDR-Disease, BioRED-Disease, NCBI-Disease, BC5CDR-Chemical, BioRED-Chemical, and NLM-Chem, achieving top-1 accuracies of 0.8307, 0.7969, 0.8271, 0.9511, 0.9248, and 0.7895, respectively. X-Linker demonstrated superior performance in three datasets: BC5CDR-Disease, NCBI-Disease, and BioRED-Chemical. In contrast, SapBERT outperformed X-Linker in the remaining three datasets. Both models rely only on the mention string for their operations. The source code of X-Linker and its associated data are publicly available for performing biomedical entity linking without requiring pre-labelled entities with identifiers from specific knowledge organization systems.
Magnetic Hysteresis Modeling with Neural Operators
Chandra, Abhishek, Daniels, Bram, Curti, Mitrofan, Tiels, Koen, Lomonova, Elena A.
Hysteresis modeling is crucial to comprehend the behavior of magnetic devices, facilitating optimal designs. Hitherto, deep learning-based methods employed to model hysteresis, face challenges in generalizing to novel input magnetic fields. This paper addresses the generalization challenge by proposing neural operators for modeling constitutive laws that exhibit magnetic hysteresis by learning a mapping between magnetic fields. In particular, two prominent neural operators -- deep operator network and Fourier neural operator -- are employed to predict novel first-order reversal curves and minor loops, where novel means they are not used to train the model. In addition, a rate-independent Fourier neural operator is proposed to predict material responses at sampling rates different from those used during training to incorporate the rate-independent characteristics of magnetic hysteresis. The presented numerical experiments demonstrate that neural operators efficiently model magnetic hysteresis, outperforming the traditional neural recurrent methods on various metrics and generalizing to novel magnetic fields. The findings emphasize the advantages of using neural operators for modeling hysteresis under varying magnetic conditions, underscoring their importance in characterizing magnetic material based devices.
CReMa: Crisis Response through Computational Identification and Matching of Cross-Lingual Requests and Offers Shared on Social Media
Lamsal, Rabindra, Read, Maria Rodriguez, Karunasekera, Shanika, Imran, Muhammad
During times of crisis, social media platforms play a vital role in facilitating communication and coordinating resources. Amidst chaos and uncertainty, communities often rely on these platforms to share urgent pleas for help, extend support, and organize relief efforts. However, the sheer volume of conversations during such periods, which can escalate to unprecedented levels, necessitates the automated identification and matching of requests and offers to streamline relief operations. This study addresses the challenge of efficiently identifying and matching assistance requests and offers on social media platforms during emergencies. We propose CReMa (Crisis Response Matcher), a systematic approach that integrates textual, temporal, and spatial features for multi-lingual request-offer matching. By leveraging CrisisTransformers, a set of pre-trained models specific to crises, and a cross-lingual embedding space, our methodology enhances the identification and matching tasks while outperforming strong baselines such as RoBERTa, MPNet, and BERTweet, in classification tasks, and Universal Sentence Encoder, Sentence Transformers in crisis embeddings generation tasks. We introduce a novel multi-lingual dataset that simulates scenarios of help-seeking and offering assistance on social media across the 16 most commonly used languages in Australia. We conduct comprehensive cross-lingual experiments across these 16 languages, also while examining trade-offs between multiple vector search strategies and accuracy. Additionally, we analyze a million-scale geotagged global dataset to comprehend patterns in relation to seeking help and offering assistance on social media. Overall, these contributions advance the field of crisis informatics and provide benchmarks for future research in the area.